Results 1 - 16 of 16
1.
Sensors (Basel); 23(17), 2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37687768

ABSTRACT

Human pose estimation is an important computer vision problem whose goal is to estimate the human body through its joints. Currently, methods that employ deep learning techniques excel at 2D human pose estimation. However, the use of 3D poses can bring more accurate and robust results. Since 3D pose labels can only be acquired in restricted scenarios, fully convolutional methods tend to perform poorly on the task. One strategy to solve this problem is to estimate 3D poses in two steps, using a 2D pose estimator to produce the 2D pose inputs. Due to database acquisition constraints, the performance improvement of this strategy can only be observed in controlled environments; domain adaptation techniques can therefore be used to increase the generalization capability of the system by injecting information from synthetic domains. In this work, we propose a novel method, the Domain Unified approach, aimed at solving pose misalignment problems in a cross-dataset scenario through a combination of three modules on top of the pose estimator: a pose converter, an uncertainty estimator, and a domain classifier. Our method led to a 44.1 mm (29.24%) error reduction when training with the SURREAL synthetic dataset and evaluating on Human3.6M, compared to a no-adaptation scenario, achieving state-of-the-art performance.
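As a rough illustration of the three-module idea, the sketch below puts a pose head, an uncertainty head, and a domain-classifier head on top of a simple 2D-to-3D lifting backbone. The architecture, layer sizes, and all names (Lifter, pose_head, etc.) are our illustrative assumptions, not the paper's released code:

```python
import torch
import torch.nn as nn

class Lifter(nn.Module):
    """Toy backbone lifting 2D joints (J x 2) to 3D joints (J x 3)."""
    def __init__(self, joints=17, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joints * 2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.pose_head = nn.Linear(hidden, joints * 3)   # 3D pose output
        self.sigma_head = nn.Linear(hidden, joints)      # per-joint uncertainty
        self.domain_head = nn.Linear(hidden, 2)          # synthetic vs. real domain

    def forward(self, joints_2d):
        h = self.net(joints_2d.flatten(1))
        return self.pose_head(h), self.sigma_head(h), self.domain_head(h)

x = torch.randn(8, 17, 2)                                # a batch of 2D poses
pose3d, log_sigma, domain_logits = Lifter()(x)
```

In the paper's setting, the domain head would be trained adversarially so that features learned on the synthetic (SURREAL) domain transfer to the real (Human3.6M) domain.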


Subject(s)
Acclimatization; Environment, Controlled; Humans; Databases, Factual; Uncertainty
2.
Article in English | MEDLINE | ID: mdl-38737297

ABSTRACT

The standard benchmark metric for 3D face reconstruction is the geometric error between reconstructed meshes and the ground truth. Nearly all recent reconstruction methods are validated on real ground-truth scans, in which case one needs to establish point correspondence prior to error computation, which is typically done with the Chamfer (i.e., nearest-neighbor) criterion. However, a simple yet fundamental question has not been asked: Is the Chamfer error an appropriate and fair benchmark metric for 3D face reconstruction? More generally, how can we determine which error estimator is a better benchmark metric? We present a meta-evaluation framework that uses synthetic data to evaluate the quality of a geometric error estimator as a benchmark metric for face reconstruction. Further, we use this framework to experimentally compare four geometric error estimators. Results show that the standard approach not only severely underestimates the error, but also does so inconsistently across reconstruction methods, to the point of even altering the ranking of the compared methods. Moreover, although non-rigid ICP leads to a metric with smaller estimation bias, it still could not correctly rank all compared reconstruction methods and is significantly more time-consuming than Chamfer. In sum, we expose several issues in current benchmarking practice and propose a procedure that uses synthetic data to address them.
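A minimal sketch of the Chamfer (nearest-neighbor) error that the abstract calls into question: for each ground-truth point, take the distance to the closest reconstructed vertex and average. The direction of the query and all variable names are our assumptions; benchmarks also differ on point-to-point versus point-to-plane distances:

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_error(reconstructed, ground_truth):
    """Mean nearest-neighbor distance from ground-truth points to the reconstruction."""
    tree = cKDTree(reconstructed)        # spatial index over reconstructed vertices
    dists, _ = tree.query(ground_truth)  # closest reconstructed vertex per GT point
    return dists.mean()

recon = np.random.rand(5000, 3)          # toy reconstructed vertices
gt = np.random.rand(4000, 3)             # toy ground-truth scan points
print(chamfer_error(recon, gt))
```

Because each ground-truth point is free to match whichever vertex happens to be nearest, this estimator can only underestimate the true corresponded error, consistent with the bias the abstract reports.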

3.
IEEE Trans Affect Comput; 13(4): 1813-1826, 2022.
Article in English | MEDLINE | ID: mdl-36452255

ABSTRACT

We propose an automatic method to estimate self-reported pain based on facial landmarks extracted from videos. For each video sequence, we decompose the face into four regions, and pain intensity is measured by modeling the dynamics of facial movement using the landmarks of these regions. A formulation based on Gram matrices is used to represent the trajectory of landmarks on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. A curve-fitting algorithm is used to smooth the trajectories, and temporal alignment is performed to compute the similarity between trajectories on the manifold. A Support Vector Regression model is then trained to encode the extracted trajectories into pain intensity levels consistent with the self-reported pain intensity measurements. Finally, a late fusion of the estimations for each region is performed to obtain the final predicted pain level. The proposed approach is evaluated on two publicly available datasets, the UNBC-McMaster Shoulder Pain Archive and the BioVid Heat Pain dataset. We compared our method to the state of the art on both datasets using different testing protocols, showing the competitiveness of the proposed approach.
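As a rough illustration of the Gram-matrix representation mentioned above: a centered configuration of n landmarks, stacked as an n x 2 matrix Z, is mapped to G = Z Z^T, a positive semi-definite matrix of rank at most 2, which serves as one point of the trajectory. Dimensions and names are illustrative, not the authors' code:

```python
import numpy as np

def gram_matrix(landmarks):
    """Map an n x 2 landmark configuration to its n x n Gram matrix."""
    z = landmarks - landmarks.mean(axis=0)  # remove translation
    return z @ z.T                          # PSD matrix of rank <= 2

frame = np.random.rand(66, 2)               # landmarks of one video frame
G = gram_matrix(frame)                      # one point on the PSD manifold
# A video then yields a trajectory: the sequence of such matrices over frames.
```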

4.
Sensors (Basel); 22(4), 2022 Feb 16.
Article in English | MEDLINE | ID: mdl-35214430

ABSTRACT

Automatic facial expression recognition is essential for many potential applications. Thus, having a clear overview of existing datasets that have been investigated within the framework of facial expression recognition is of paramount importance in designing and evaluating effective solutions, notably for neural network-based training. In this survey, we provide a review of more than eighty facial expression datasets, taking into account both macro- and micro-expressions. The study mostly focuses on spontaneous and in-the-wild datasets, given the common trend in the research toward settings where expressions are shown spontaneously and in a real context. We also provide instances of potential applications of the investigated datasets, while highlighting their pros and cons. The proposed survey can help researchers to better understand the characteristics of the existing datasets, thus facilitating the choice of the data that best suits the particular context of their application.


Subject(s)
Facial Expression; Facial Recognition; Face; Neural Networks, Computer
5.
IEEE Trans Pattern Anal Mach Intell; 44(2): 848-863, 2022 Feb.
Article in English | MEDLINE | ID: mdl-32750786

ABSTRACT

In this work, we propose a novel approach for generating videos of the six basic facial expressions given a neutral face image. We propose to exploit the face geometry by modeling the motion of facial landmarks as curves encoded as points on a hypersphere. By proposing a conditional version of a manifold-valued Wasserstein generative adversarial network (GAN) for motion generation on the hypersphere, we learn the distribution of facial expression dynamics of different classes, from which we synthesize new facial expression motions. The resulting motions can be transformed into sequences of landmarks and then into image sequences by editing the texture information using another conditional GAN. To the best of our knowledge, this is the first work to explore manifold-valued representations with GANs to address the problem of dynamic facial expression generation. We evaluate our approach both quantitatively and qualitatively on two public datasets: Oulu-CASIA and MUG Facial Expression. Our experimental results demonstrate the effectiveness of our approach in generating videos with continuous motion, realistic appearance, and identity preservation. We also show the efficiency of our framework for dynamic facial expression generation, dynamic facial expression transfer, and data augmentation for training improved emotion recognition models.
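A heavily simplified sketch of the manifold-valued idea, under stated assumptions: a landmark-motion curve is flattened to a vector and normalized to unit norm, i.e., treated as a point on a hypersphere, and a class-conditional generator outputs such points. The actual method trains a Wasserstein GAN with manifold-aware losses; everything below (sizes, names, the plain generator) is illustrative:

```python
import torch
import torch.nn as nn

class SphereGenerator(nn.Module):
    """Conditional generator whose outputs lie on the unit hypersphere."""
    def __init__(self, latent=64, classes=6, curve_dim=68 * 2 * 32):
        super().__init__()
        self.embed = nn.Embedding(classes, 16)     # one of six basic expressions
        self.net = nn.Sequential(
            nn.Linear(latent + 16, 512), nn.ReLU(),
            nn.Linear(512, curve_dim),
        )

    def forward(self, z, labels):
        x = self.net(torch.cat([z, self.embed(labels)], dim=1))
        return x / x.norm(dim=1, keepdim=True)     # project onto the hypersphere

g = SphereGenerator()
curves = g(torch.randn(4, 64), torch.tensor([0, 1, 2, 3]))  # 4 motion samples
```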


Subject(s)
Facial Expression; Neural Networks, Computer; Algorithms; Face; Motion
6.
IEEE Trans Pattern Anal Mach Intell; 44(10): 6667-6682, 2022 Oct.
Article in English | MEDLINE | ID: mdl-34156937

ABSTRACT

The 3D Morphable Model (3DMM) is a powerful statistical tool for representing 3D face shapes. To build a 3DMM, a training set of face scans in full point-to-point correspondence is required, and its modeling capabilities directly depend on the variability contained in the training data. Thus, to increase the descriptive power of the 3DMM, establishing a dense correspondence across heterogeneous scans with sufficient diversity in terms of identities, ethnicities, or expressions becomes essential. In this manuscript, we present a fully automatic approach that leverages a 3DMM to transfer its dense semantic annotation across raw 3D faces, establishing a dense correspondence between them. We propose a novel formulation to learn a set of sparse deformation components with local support on the face that, together with an original non-rigid deformation algorithm, allow the 3DMM to precisely fit unseen faces and transfer its semantic annotation. We extensively evaluated our approach, showing it can effectively generalize to highly diverse samples and accurately establish a dense correspondence even in the presence of complex facial expressions. The accuracy of the dense registration is demonstrated by building a heterogeneous, large-scale 3DMM from more than 9,000 fully registered scans obtained by joining three large datasets.
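The paper's deformation components have local support and the fitting is non-rigid and iterative; the sketch below shows only the linear-model core shared by 3DMMs (vertices = mean + basis x coefficients) fitted by regularized least squares under known correspondences. All dimensions are toy values:

```python
import numpy as np

n_vertices, n_components = 2000, 50
mean = np.random.rand(n_vertices * 3)                      # mean face (flattened)
components = np.random.rand(n_vertices * 3, n_components)  # deformation basis
target = np.random.rand(n_vertices * 3)                    # corresponded target scan

lam = 1e-3                                                 # ridge regularization
A = components.T @ components + lam * np.eye(n_components)
b = components.T @ (target - mean)
coeffs = np.linalg.solve(A, b)                             # least-squares fit
fitted = mean + components @ coeffs                        # 3DMM deformed to target
```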


Subject(s)
Algorithms; Pattern Recognition, Automated; Face/diagnostic imaging; Imaging, Three-Dimensional; Semantics
7.
J Imaging; 7(12), 2021 Nov 30.
Article in English | MEDLINE | ID: mdl-34940724

ABSTRACT

Estimating the 3D shape of objects from monocular images is a well-established and challenging task in the computer vision field. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from single images. In particular, we provide a solution to the problem of estimating the shape of the body when the subject is wearing clothes. This is a highly challenging scenario, as loose clothes might hide the underlying body shape to a large extent. To this aim, we make use of a parametric 3D body model, SMPL, whose parameters describe the pose and shape of the body. Our main intuition is that the shape parameters associated with an individual should not change whether the subject is wearing clothes or not. To improve shape estimation under clothing, we train a deep convolutional network to regress the shape parameters from a single image of a person. To increase robustness to clothing, we build our training dataset by associating the shape parameters of a "minimally clothed" person with other samples of the same person wearing looser clothes. Experimental validation shows that our approach can estimate body shape parameters more accurately than state-of-the-art approaches, even in the case of loose clothes.
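A minimal sketch of the training idea, assuming the usual 10-dimensional SMPL shape vector (betas): the regression target for an image of a clothed person is the shape vector recovered from the same subject minimally clothed, so the network is pushed to see through clothing. Network, sizes, and names are our stand-ins for the paper's deep CNN:

```python
import torch
import torch.nn as nn

regressor = nn.Sequential(                   # toy stand-in for the deep CNN
    nn.Flatten(),
    nn.Linear(3 * 224 * 224, 256), nn.ReLU(),
    nn.Linear(256, 10),                      # 10 SMPL shape parameters (betas)
)
clothed_images = torch.randn(4, 3, 224, 224)  # images of clothed subjects
betas_minimal = torch.randn(4, 10)            # betas of the same subjects, minimally clothed

loss = nn.functional.mse_loss(regressor(clothed_images), betas_minimal)
loss.backward()                               # clothing-invariant supervision signal
```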

8.
Article in English | MEDLINE | ID: mdl-34651145

ABSTRACT

We propose an automatic method for pain intensity measurement from video. For each video, pain intensity was measured from the dynamics of facial movement captured by 66 facial points. A Gram-matrix formulation was used to represent the facial point trajectories on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank. Curve fitting and temporal alignment were then used to smooth the extracted trajectories. A Support Vector Regression model was then trained to encode the extracted trajectories into ten pain intensity levels consistent with the Visual Analogue Scale for pain intensity measurement. The proposed approach was evaluated on the UNBC-McMaster Shoulder Pain Archive and compared to the state of the art on the same data. Using both 5-fold cross-validation and leave-one-subject-out cross-validation, our results are competitive with state-of-the-art methods.
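A minimal sketch of the final regression stage under stated assumptions: per-video features derived from the aligned trajectories (random stand-ins here) are mapped by a Support Vector Regression model to a 0-10 VAS-like pain score:

```python
import numpy as np
from sklearn.svm import SVR

X = np.random.rand(200, 30)               # per-video trajectory features (toy)
y = np.random.uniform(0, 10, 200)         # self-reported VAS pain intensities (toy)

model = SVR(kernel="rbf").fit(X, y)
print(model.predict(X[:3]).clip(0, 10))   # clamp predictions to the valid VAS range
```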

9.
Sensors (Basel); 21(2), 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33467595

ABSTRACT

Facial Action Units (AUs) correspond to the deformation/contraction of individual facial muscles or their combinations. As such, each AU affects just a small portion of the face, with deformations that are asymmetric in many cases. Generating and analyzing AUs in 3D is particularly relevant for the potential applications it can enable. In this paper, we propose a solution for 3D AU detection and synthesis by building on a newly defined 3D Morphable Model (3DMM) of the face. Differently from most 3DMMs in the literature, which mainly model global variations of the face and show limitations in adapting to local and asymmetric deformations, the proposed solution is specifically devised to cope with such difficult morphings. During a training phase, deformation coefficients are learned that enable the 3DMM to deform to 3D target scans showing neutral and expressive faces of the same individual, thus decoupling expression from identity deformations. Such deformation coefficients are then used, on the one hand, to train an AU classifier; on the other, they can be applied to a 3D neutral scan to generate AU deformations in a subject-independent manner. The proposed approach for AU detection is validated on the Bosphorus dataset, reporting competitive results with respect to the state of the art, even in a challenging cross-dataset setting. We further show that the learned coefficients are general enough to synthesize realistic 3D face instances with AU activations.
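A rough sketch of the detection side, under stated assumptions: the deformation coefficients that morph a neutral fit toward an expressive scan become the feature vector for a per-AU classifier. The data and the choice of logistic regression are illustrative stand-ins:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

coeffs = np.random.randn(300, 40)           # learned deformation coefficients (toy)
au_active = np.random.randint(0, 2, 300)    # presence/absence of one AU (toy labels)

clf = LogisticRegression(max_iter=1000).fit(coeffs, au_active)
print(clf.predict_proba(coeffs[:2])[:, 1])  # per-scan AU activation probability
```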

10.
Article in English | MEDLINE | ID: mdl-30281437

ABSTRACT

In this paper, we propose a novel space-time geometric representation of human landmark configurations and derive tools for comparison and classification. We model the temporal evolution of landmarks as parametrized trajectories on the Riemannian manifold of positive semidefinite matrices of fixed rank. Our representation has the benefit of naturally bringing a second desirable quantity when comparing shapes, the spatial covariance, in addition to the conventional affine-shape representation. We then derive geometric and computational tools for rate-invariant analysis and adaptive re-sampling of trajectories, grounded in the Riemannian geometry of the underlying manifold. Specifically, our approach involves three steps: (1) landmarks are first mapped into the Riemannian manifold of positive semidefinite matrices of fixed rank to build time-parameterized trajectories; (2) a temporal warping is performed on the trajectories, providing a geometry-aware (dis-)similarity measure between them; (3) finally, a pairwise proximity function SVM is used to classify them, incorporating the (dis-)similarity measure into the kernel function. We show that this representation and metric achieve competitive results in applications such as action recognition and emotion recognition from 3D skeletal data, and facial expression recognition from videos. Experiments have been conducted on several publicly available up-to-date benchmarks.
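A simplified sketch of steps (2) and (3): dynamic time warping gives a temporal-warping distance between two matrix trajectories, and an SVM consumes a kernel built from the pairwise distances. The Frobenius norm stands in for the true manifold metric here, and exp(-DTW) is not guaranteed to be a valid positive definite kernel, which is precisely why the paper resorts to a pairwise proximity function SVM:

```python
import numpy as np
from sklearn.svm import SVC

def dtw(seq_a, seq_b):
    """Classic dynamic time warping over per-frame matrix distances."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # Frobenius
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

trajectories = [np.random.rand(20, 10, 10) for _ in range(30)]  # toy sequences
labels = np.random.randint(0, 3, 30)
K = np.array([[np.exp(-dtw(a, b)) for b in trajectories] for a in trajectories])
clf = SVC(kernel="precomputed").fit(K, labels)
```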


Subject(s)
Image Processing, Computer-Assisted/methods; Movement/physiology; Pattern Recognition, Automated/methods; Anatomic Landmarks/diagnostic imaging; Databases, Factual; Emotions/physiology; Humans; Support Vector Machine; Video Recording
11.
IEEE Trans Neural Netw Learn Syst; 31(10): 3892-3905, 2020 Oct.
Article in English | MEDLINE | ID: mdl-31725395

ABSTRACT

In this article, we propose a new approach to facial expression recognition (FER) using deep covariance descriptors. The solution is based on the idea of encoding local and global deep convolutional neural network (DCNN) features extracted from still images into compact local and global covariance descriptors. The space geometry of the covariance matrices is that of symmetric positive definite (SPD) matrices. By classifying static facial expressions with a support vector machine (SVM) using a valid Gaussian kernel on the SPD manifold, we show that deep covariance descriptors are more effective than standard classification with fully connected layers and softmax. In addition, we propose a new solution to model the temporal dynamics of facial expressions as deep trajectories on the SPD manifold. As an extension of the classification pipeline for covariance descriptors, we apply SVM with valid positive definite kernels derived from global alignment to classify deep covariance trajectories. By performing extensive experiments on the Oulu-CASIA, CK+, static facial expression in the wild (SFEW), and acted facial expressions in the wild (AFEW) datasets, we show that both the proposed static and dynamic approaches achieve state-of-the-art performance for FER, outperforming many recent approaches.
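A rough sketch of the static branch, under stated assumptions: DCNN activations are pooled into a covariance descriptor (an SPD matrix after regularization) and descriptors are compared with a Gaussian kernel on the log-Euclidean distance, one standard valid kernel on the SPD manifold (the paper's exact kernel may differ):

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """n_locations x d DCNN activations -> regularized d x d SPD matrix."""
    return np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])

def spd_log(A):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T

def log_euclidean_rbf(A, B, gamma=0.1):
    """Gaussian kernel on the log-Euclidean distance between SPD matrices."""
    d = np.linalg.norm(spd_log(A) - spd_log(B), "fro")
    return np.exp(-gamma * d ** 2)

f1, f2 = np.random.rand(49, 64), np.random.rand(49, 64)  # toy 7x7 feature maps
print(log_euclidean_rbf(covariance_descriptor(f1), covariance_descriptor(f2)))
```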

12.
Article in English | MEDLINE | ID: mdl-30059306

ABSTRACT

Face recognition "in the wild" has been revolutionized by the deployment of deep learning-based approaches. In fact, it has been extensively demonstrated that Deep Convolutional Neural Networks (DCNNs) are powerful enough to overcome most of the limits that affected face recognition algorithms based on hand-crafted features, including variations in illumination, pose, expression, and occlusion, to name a few. The discriminative power of DCNNs comes from the fact that low- and high-level representations are learned directly from the raw image data. As a consequence, we expect the performance of a DCNN to be influenced by the characteristics of the image/video data fed to the network and their preprocessing. In this work, we present a thorough analysis of several aspects that impact the use of DCNNs for face recognition. The evaluation has been carried out from two main perspectives: the network architecture and the similarity measures used to compare deeply learned features; and the data (source and quality) and their preprocessing (bounding box and alignment). Results obtained on the IJB-A, MegaFace, UMDFaces and YouTube Faces datasets provide practical hints for designing, training and testing DCNNs. Taking into account the outcomes of the experimental evaluation, we show how performance competitive with the state of the art can be reached even with standard DCNN architectures and pipelines.
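One of the evaluated ingredients, feature comparison, commonly reduces to cosine similarity between L2-normalized embeddings; a minimal sketch (the threshold and dimensions are illustrative, and the paper compares several measures):

```python
import numpy as np

def cosine_similarity(a, b):
    a = a / np.linalg.norm(a)                 # L2-normalize both embeddings
    b = b / np.linalg.norm(b)
    return float(a @ b)

emb1, emb2 = np.random.rand(512), np.random.rand(512)  # stand-in DCNN embeddings
same_identity = cosine_similarity(emb1, emb2) > 0.5    # illustrative threshold
```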

13.
IEEE Trans Cybern; 45(7): 1340-1352, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25216492

ABSTRACT

Recognizing human actions in 3-D video sequences is an important open problem at the heart of many research domains, including surveillance, natural interfaces, and rehabilitation. However, designing action recognition models that are both accurate and efficient is challenging due to the variability of human pose, clothing, and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor and enable accurate action recognition. The proposed solution builds on fitting a human skeleton model to the acquired data so as to represent the 3-D coordinates of the joints and their change over time as a trajectory in a suitable action space. Thanks to such a 3-D joint-based framework, the proposed solution is capable of capturing both the shape and the dynamics of the human body simultaneously. The action recognition problem is then formulated as computing the similarity between the shapes of trajectories in a Riemannian manifold. Classification using k-nearest neighbors is finally performed on this manifold, taking advantage of Riemannian geometry in the open-curve shape space. Experiments are carried out on four representative benchmarks to demonstrate the potential of the proposed solution in terms of accuracy and latency for low-latency action recognition. Comparative results with state-of-the-art methods are reported.
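A simplified sketch of the trajectory comparison, under stated assumptions: each action is a curve of stacked 3D joint coordinates, represented by its square-root velocity function (SRVF); we use a plain chordal distance between SRVFs, whereas the true open-curve shape-space distance also optimizes over rotations and re-parameterizations:

```python
import numpy as np

def srvf(curve):
    """curve: T x d joint trajectory -> its square-root velocity representation."""
    v = np.gradient(curve, axis=0)                                # velocities
    norms = np.maximum(np.linalg.norm(v, axis=1, keepdims=True), 1e-8)
    q = v / np.sqrt(norms)
    return q / np.linalg.norm(q)                                  # scale-invariant

def srvf_distance(c1, c2):
    return np.linalg.norm(srvf(c1) - srvf(c2))  # chordal shortcut, no alignment

a = np.random.rand(50, 60)   # 50 frames x (20 joints * 3 coordinates), toy data
b = np.random.rand(50, 60)
print(srvf_distance(a, b))   # distance fed to a k-nearest-neighbor classifier
```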


Subject(s)
Imaging, Three-Dimensional/methods; Machine Learning; Movement/physiology; Pattern Recognition, Automated/methods; Photography/methods; Whole Body Imaging/methods; Actigraphy/methods; Humans; Image Interpretation, Computer-Assisted/methods; Video Recording/methods
14.
IEEE Trans Image Process; 24(1): 220-235, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25398180

ABSTRACT

In this paper, we present a novel and original framework, which we dub mesh-LBP, for computing local binary pattern (LBP)-like descriptors on a triangular-mesh manifold. This framework can be adapted to all the LBP variants employed in 2D image analysis and, as such, allows extending the related techniques to mesh surfaces. After describing the foundations, the construction, and the main features of the mesh-LBP, we derive its possible variants and show how they can extend most of the 2D-LBP variants to the mesh manifold. In the experiments, we give evidence of the presence of the uniformity aspect in the mesh-LBP, similar to the one noticed in the 2D-LBP. We also report repeatability experiments that confirm, in particular, the rotation invariance of mesh-LBP descriptors. Furthermore, we analyze the potential of the mesh-LBP for the task of 3D texture classification of triangular-mesh surfaces collected from public datasets. Comparisons with state-of-the-art surface descriptors, as well as with 2D-LBP counterparts applied to depth images, also evidence the effectiveness of the proposed framework. Finally, we illustrate the robustness of the mesh-LBP with respect to the class of mesh irregularity typical of 3D surface-digitizer scans.
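The core operation can be pictured as follows, with the details (the scalar function on the mesh, the ordered-ring extraction) abstracted away as assumptions: given a value at a central facet and the values on an ordered ring of m surrounding facets, threshold each ring value against the center to obtain an m-bit code, much as 2D LBP thresholds a pixel's circular neighborhood:

```python
import numpy as np

def mesh_lbp_code(center_value, ring_values):
    """Binary code from an ordered ring of facet values around a central facet."""
    bits = (np.asarray(ring_values) >= center_value).astype(int)
    return int(sum(bit << k for k, bit in enumerate(bits)))

# Toy ring of m = 8 facet values (e.g., mean curvature) around one facet:
print(mesh_lbp_code(0.5, [0.6, 0.4, 0.7, 0.2, 0.9, 0.5, 0.1, 0.3]))
```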

15.
IEEE Trans Cybern; 44(12): 2443-2457, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25415949

ABSTRACT

In this paper, we present an automatic approach for facial expression recognition from 3-D video sequences. In the proposed solution, the 3-D faces are represented by collections of radial curves, and a Riemannian shape analysis is applied to effectively quantify the deformations induced by the facial expressions in a given subsequence of 3-D frames. This is obtained from the dense scalar field, which denotes the shooting directions of the geodesic paths constructed between pairs of corresponding radial curves of two faces. Since the resulting dense scalar fields are high-dimensional, a Linear Discriminant Analysis (LDA) transformation is applied to the dense feature space. Two methods are then used for classification: 1) 3-D motion extraction with a temporal Hidden Markov Model (HMM) and 2) mean deformation capturing with a random forest. While the first approach trains a dynamic HMM on the features, the second computes mean deformations under a window and applies a multiclass random forest. Both of the proposed classification schemes on the scalar fields showed comparable results and outperformed earlier studies on facial expression recognition from 3-D video sequences.
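A minimal sketch of the second classification scheme under stated assumptions (random stand-in data, illustrative dimensions): windowed mean dense scalar fields are reduced with LDA and classified with a multiclass random forest:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(300, 500)          # per-window mean dense scalar fields (toy)
y = np.random.randint(0, 6, 300)      # six facial expression classes (toy)

X_lda = LinearDiscriminantAnalysis(n_components=5).fit_transform(X, y)
clf = RandomForestClassifier(n_estimators=100).fit(X_lda, y)
print(clf.predict(X_lda[:3]))
```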


Subject(s)
Artificial Intelligence; Biometry/methods; Face/anatomy & histology; Facial Expression; Imaging, Three-Dimensional/methods; Pattern Recognition, Automated/methods; Algorithms; Image Enhancement/methods; Image Interpretation, Computer-Assisted/methods; Reproducibility of Results; Sensitivity and Specificity; Video Recording/methods
16.
IEEE Trans Pattern Anal Mach Intell; 32(12): 2162-2177, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20975115

ABSTRACT

In this paper, we present a novel approach to 3D face matching that shows high effectiveness in distinguishing facial differences between distinct individuals from differences induced by non-neutral expressions within the same individual. The approach takes into account geometrical information of the 3D face and encodes the relevant information into a compact graph representation. Nodes of the graph represent equal-width isogeodesic facial stripes. Arcs between pairs of nodes are labeled with descriptors, referred to as 3D Weighted Walkthroughs (3DWWs), that capture the mutual relative spatial displacement between all pairs of points of the corresponding stripes. Face partitioning into isogeodesic stripes and 3DWWs together provide an approximate representation of the local morphology of faces that exhibits smooth variations under changes induced by facial expressions. The graph-based representation permits very efficient matching for face recognition and is also suited to face identification in very large datasets with the support of appropriate index structures. The method obtained the best ranking at the SHREC 2008 contest for 3D face recognition. We present an extensive comparative evaluation of performance on the FRGC v2.0 and SHREC08 datasets.
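A rough sketch of the stripe construction, with the geodesic computation abstracted away: per-vertex geodesic distances from a reference point (e.g., the nose tip) are normalized and binned into equal-width stripes, each stripe becoming one graph node. A real pipeline would compute geodesics on the mesh, e.g., by fast marching; the distances below are random stand-ins:

```python
import numpy as np

def isogeodesic_stripes(geodesic_dist, n_stripes=9):
    """Assign each vertex to an equal-width stripe of normalized geodesic distance."""
    d = geodesic_dist / geodesic_dist.max()           # normalize to [0, 1]
    return np.minimum((d * n_stripes).astype(int), n_stripes - 1)

dists = np.random.rand(10000)                  # toy per-vertex geodesic distances
stripe_of_vertex = isogeodesic_stripes(dists)  # stripe (graph node) index per vertex
```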


Subject(s)
Biometric Identification/methods; Face/anatomy & histology; Image Processing, Computer-Assisted/methods; Algorithms; Databases, Factual; Female; Humans; Male; Principal Component Analysis